Effect of congestion avoidance due to congestion information provision on optimizing agent dynamics on an endogenous star network topology

This study elucidates the effect of congestion avoidance of agents given congestion information on optimizing traffic in a star topology in which the central node is connected to isolated secondary nodes with different preferences. Each agent at the central node stochastically selects a secondary node by referring to the declining preferences based on the congestion rate of the secondary nodes. We investigated two scenarios: (1) repeated visits and (2) a single visit for each node. For (1), we found that multivariate statistics describe well the nonlinear dependence of agent distribution on the number of secondaries, demonstrating the existence of the number of secondaries that makes the distribution the most uniform. For (2), we discovered that congestion avoidance linearizes the travel time for all agents visiting all nodes; in contrast, the travel time increases exponentially with secondaries when not referring to congestion information. Health examination epitomizes this finding; by allowing patients to be preferentially selected for examination in vacant examination sites, we can linearize the time it takes for everyone to complete their examination. We successfully described the optimization effect of congestion avoidance on the collective dynamics of agents in star topologies.

Network science has been acknowledged as a crosscutting discipline that pursues the universal nature of network characteristics. Fundamental research on network topologies has been particularly significant for many applications [1]. Namely, knowledge of network topologies is beneficial not only for a specific field but also for various areas in science, engineering, and technology. Examples include the Internet, telecommunications, ecological systems, and transport networks [2][3][4]. It is necessary to analyze structural properties, such as the average path lengths or node degrees [5][6][7]. Furthermore, there is a strong demand to explore transport properties, namely the distribution of agents and their traffic in topological networks, where agents usually represent creatures, humans, or objects. Graph theory [8,9] and queuing theory [10,11] are effective tools for the theoretical analysis of traffic transportation [12][13][14][15]. However, the applications of these theories to complex real-world transport networks are limited. As a remedial measure, cellular automata [16,17] or multi-agent simulations have been utilized as supportive but powerful approaches to complex network problems [18][19][20].
Elucidating the effect of agents' congestion avoidance on their collective dynamics in basic topologies can help deepen the understanding of the mechanism of "traffic jams" in various network sciences. However, such an effect has not been fully discussed in multivariate statistics. In studies of traffic transport networks, there have been two different approaches to analyzing network problems: conceptual and semi-empirical. The former examines the basic topologies or conceptual network models to grasp the essence, whereas the latter develops generic models and adapts them to complicated phenomena by optimizing parameters according to experimental data. Many related studies have primarily focused on applications in engineering; thus, they used semi-empirical approaches [21][22][23][24]. For example, several studies have parameterized multiple types of traffic information [e.g., ...

[...] because they can be direct applications of the findings of this study. Therefore, as mentioned above, we decided to measure the characteristics of interest in each scenario: the uniformity of the agent distribution in the steady state for Scenario-1, and the travel time of agents visiting all nodes for Scenario-2.
The remainder of this paper is organized as follows. In Section "Methods", we provide a detailed description of the target system. We also derive analytical solutions for a specific case in which agents are provided with no traffic information by solving the state-transition equations. We then report the results of small preliminary simulations in both scenarios to clarify the following two characteristics: (a) the equalization of network usage owing to congestion avoidance of agents observed in Scenario-1, and (b) the shortening of travel time due to the equalization effect observed in Scenario-2. Considering the results of the preliminary tests, we define a criterion indicating the uniform distribution of agents based on the above-mentioned analytical solution as a preparation for the subsequent sections. In Section "Results", we investigate the uniformity of agent distributions in Scenario-1. We observe that the uniformity of agent distribution shows three types of nonlinear dependence on the increase in nodes. We also report that congestion avoidance linearizes the travel time irrespective of the degree of reference to the congestion information in Scenario-2. In contrast, the travel time increases exponentially with the number of secondary nodes when agents do not refer to congestion information at all. In Section "Discussion", we derive a theoretical model based on multivariate statistics and compare it with the simulation results, reporting that our model clearly describes the observed characteristic dependence in Scenario-1. In particular, our analysis indicates that the balance between the equalization of network usage by avoiding congestion and the covariance caused by mutually referring to congestion information determines overall uniformity. We also discuss why the travel time increases exponentially when agents are provided with no congestion information and is linearized otherwise.
Section "Conclusion" summarizes the results and concludes the study.

Methods
Model. Consider a star topology with a primary node connected to $M$ isolated secondary nodes with different preferences. We denote the total number of agents by $N_v$, the number of agents at the $i$th secondary node at discrete time step $t$ by $N_i(t)$, and the fixed preference of agents for the $i$th secondary node by $w_i$. For ease of understanding, we refer to $w_i$ as a weight or the $i$th preference. Each agent at the primary node stochastically chooses one of the secondary nodes by referring to the preferences affected by the congestion rate of each secondary node. Accordingly, the probability $r_i$ that an agent moves from the primary node to the $i$th secondary node is given by Eq. (1), where the parameter $A$ determines the moving rate from the primary node to any of the secondary nodes and controls the outflow of agents among the $N_v$ agents from the primary to the secondary nodes. By contrast, we set the return rate from the secondary nodes to the primary node as $l$ times $A$; in other words, parameter $l$ controls the inflow from the secondary nodes to the primary node. Parameter $\alpha$ determines the effect of avoiding congestion on the decline in preference $w_i$. Figure 1A shows the schematic of the target system from the viewpoint of decision-making; each layer in the horizontal direction represents all possible choices in decision-making by an identical agent. For instance, the yellow $L_j$th layer has $M$ secondary nodes that the $j$th agent can select. $s_i$ indicates the primary node when $i = 0$ and the $i$th secondary node otherwise. Figure 1B depicts all the possible states that an agent can take, corresponding to the yellow layer in Fig. 1A.
In the target system, all agents begin at the primary node. Each agent at the primary node stochastically selects one of the secondary nodes by referring to the preferences affected by the congestion rate of each secondary node. We examined the following two scenarios: (1) each agent can access the same secondary node repeatedly and (2) each agent can access each secondary node only once. We refer to the former scenario as Scenario-1 and the latter as Scenario-2. We investigated the uniformity of agent distribution in the stationary state in Scenario-1 and measured the travel time for all agents visiting all the nodes in Scenario-2. In Scenario-1, the average outflow of agents moving from the primary to the secondary nodes is controlled by parameter $A$. By contrast, in Scenario-2, the total probability is normalized among all the secondary nodes, and the agents leave out the probabilities of choosing the already visited secondary nodes. If an already visited node is selected in a trial, the system discards the trial; Scenario-2 can therefore be considered a call-loss system. All the agents are updated simultaneously in both scenarios. We introduce the physical value $U$, defined by Eq. (2), to evaluate the usage of each node, where $U_f$ represents the number of agents in a secondary node after reaching a stationary state, and $N_v$ denotes the total number of agents. In general, it is difficult to theoretically prove the existence of the stationary state in endogenous traffic networks. In this study, we measured $U$ over a wide parameter range and confirmed that the system reaches the stationary state within the measured ranges; all simulation results in this study were measured after the system was confirmed to have reached the stationary state. Regarding travel time in Scenario-2, we count the number of time steps required for all agents to finish visiting all secondary nodes and refer to this as $T_s$.
Accordingly, we investigate the dependence of these two features, $U$ and $T_s$, on the major parameters of the system. We then discuss these mechanisms from the perspective of multivariate statistics. The method for evaluating the uniformity of the distribution is described in Section "Evaluation of the uniformity of agent distribution".

Agent distribution. As depicted in Fig. 1B, each agent assumes one of the $M + 1$ possible states. The geometric feature of the star topology leads to the following rules: the state in which an agent exists on the primary node, $S_0$, can be reached from any of the states from $S_0$ to $S_M$. In addition, each of the states from $S_1$ to $S_M$ is reached only from state $S_0$. Accordingly, the state at time step $n + 1$ is determined by the transitions from one of the $M + 1$ states at time step $n$; therefore, the state transition equations between time steps $n + 1$ and $n$ can be described as $S^{(n+1)} = T S^{(n)}$, where $S^{(n)}$ is the state vector at time step $n$ given by Eq. (3). Its element $S^n_x$ ($x = 0, 1, \cdots, M$) represents the state in which an agent exists on the $x$th node at time step $n$; each element of $S^{(n)}$ corresponds to a state depicted in Fig. 1B. $T$ is a state transition matrix of size $(M + 1) \times (M + 1)$. The element $e_{ij}$ of $T$ in the $i$th column and the $j$th row is given by Eq. (4). The relationship $S^{(n+1)} \approx S^{(n)}$ holds after a stationary state is reached. The exact expression of the state vector $S^{(n)}$ is obtained by solving the state transition equations, which yields Eq. (5), where $c$ is an indeterminate parameter. Although $S^{(n)}$ is a state vector, we can regard it as a probability distribution by setting $c = l$, which is derived from the normalization condition $|S^{(n)}| = 1$. However, we can see the difficulty of solving this problem: when $\alpha \neq 0$, the system becomes endogenous. Namely, the number of agents in a stationary state is determined by the state $S^{(n)}$, whereas $S^{(n)}$ itself includes the number of agents $N_i(t)$, as confirmed by the right-hand side of Eq. (1).
The agent distribution at the steady state can be obtained because of this autonomous self-optimization. Let us elaborate on this point further. If the system is exogenous, it is possible to estimate the agent distribution using Eq. (1) and Eq. (5) after reaching the stationary state; however, our system is endogenous when $\alpha \neq 0$. Namely, the resulting agent distribution is fed back into the input congestion information in the second term in the curly braces of Eq. (1). In such cases, we must consider that a deviation $\epsilon_c$ of the indeterminate parameter $c$ exists as $c = \acute{c} + \epsilon_c$, while $\epsilon_c$ is sufficiently small that the relation $S^{(n+1)} \approx S^{(n)}$ still holds. Because the system is endogenous, $\epsilon_c$ propagates and is amplified over time, affecting the state of the system after a long period; eventually, $\epsilon_c$ can cause the nonlinearity of the system. However, because $\epsilon_c$ is unpredictable, it is impossible to estimate its effect theoretically. Therefore, performing numerical simulations is essential for investigating the system dynamics. In addition, the solution always includes an indeterminate parameter. In a wider sense, the target system can be regarded as a Diophantine problem [48]. Meanwhile, in the case of $\alpha = 0$, the second term in the curly braces in Eq. (1) is always zero. Therefore, the system becomes exogenous, and we can obtain the exact solutions for Scenario-1 by setting $c$ to $l$ such that $S^{(n)}$ satisfies the normalization $|S^{(n)}| = 1$.
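For the exogenous case $\alpha = 0$, the stationary state can be checked numerically. The following is a minimal sketch, not the authors' implementation. It assumes that, for $\alpha = 0$, the move probability reduces to $r_i = A w_i / \sum_j w_j$, and that an agent on a secondary node returns with probability $lA$ per step and otherwise stays where it is. The function names `transition_matrix` and `stationary_state` are hypothetical. Under these assumptions, repeated application of $S^{(n+1)} = T S^{(n)}$ converges to $S_0 = l/(l+1)$ and $S_i = w_i / \{(l+1) \sum_j w_j\}$, which equals $1/\{(l+1)M\}$ for uniform preferences.

```python
def transition_matrix(weights, A, l):
    """Column-stochastic matrix T for alpha = 0 (assumed form, see lead-in).

    Index 0 is the primary node; index i >= 1 is the i-th secondary node.
    T[row][col] is the probability of moving from state `col` to state `row`.
    """
    M = len(weights)
    total_w = float(sum(weights))
    T = [[0.0] * (M + 1) for _ in range(M + 1)]
    T[0][0] = 1.0 - A                      # stay on the primary node
    for i, w in enumerate(weights, start=1):
        T[i][0] = A * w / total_w          # primary -> i-th secondary (assumed r_i)
        T[0][i] = l * A                    # i-th secondary -> primary
        T[i][i] = 1.0 - l * A              # stay on the i-th secondary node
    return T


def stationary_state(T, iterations=2000):
    """Iterate S <- T S, starting with all probability on the primary node."""
    n = len(T)
    S = [1.0] + [0.0] * (n - 1)
    for _ in range(iterations):
        S = [sum(T[row][col] * S[col] for col in range(n)) for row in range(n)]
    return S


if __name__ == "__main__":
    weights, A, l = [1, 2, 3], 0.1, 2.0
    S = stationary_state(transition_matrix(weights, A, l))
    closed_form = [l / (l + 1)] + [w / ((l + 1) * sum(weights)) for w in weights]
    for numeric, exact in zip(S, closed_form):
        print(f"{numeric:.6f}  {exact:.6f}")
```

The sketch requires $A \le 1$ and $lA \le 1$ so that every column of the matrix is a valid probability distribution.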
As a preliminary test for Scenario-1, we focused on a problem with $M = 3$. We set the preferences $(w_1, w_2, w_3)$ to (1, 2, 3), where the subscript $x$ of $w_x$ corresponds to the index of the secondary node. We set the parameters $(N_v, A)$ to (1,000, 0.1) in this test. We set the number of time steps for each simulation to 1,000, which was confirmed in preliminary tests to be sufficient for reaching a stationary state. Then, we measured the usage of each secondary node for different values of $l$ and $\alpha$. Figure 2a compares the simulations with our theoretical model in Eq. (5) when $\alpha = 0$, for different values of the parameter $l$ between 0 and 3.0. The solid lines and circle symbols in Fig. 2a represent the simulations and the model values, respectively, which agreed well over the entire measured range. As previously mentioned, we cannot obtain analytical solutions for the cases of $\alpha \neq 0$ because the second term in the curly braces of Eq. (1) includes the number of agents $N_i(t)$. Accordingly, we performed simulations for $\alpha = 2.5$ as an example of $\alpha \neq 0$ for different values of $l$ between 0 and 3.0 and compared them with the model values of Eq. (5) with $\alpha = 0$. The results are presented in Fig. 2b on a logarithmic scale, where the solid lines indicate the simulation results and the symbols represent the model values of Eq. (5). Here, we made some interesting observations. As shown in Fig. 2b, the usage $U$ of the first node was found to approach that of the second node, which had a neutral preference. In addition, $U$ of the third node, which had the largest preference, approached that of the second node from the opposite direction. Namely, the usages of the three nodes were almost equal regardless of parameter $l$.
Similarly, we performed simulations for $l = 2.0$ for different values of $\alpha$ between 0 and 3.0 and compared them with the model values of Eq. (5) with $\alpha = 0$. The results are presented in Fig. 2c. As with the results in the $l$ direction, the usage $U$ of the first node was found to approach that of the second node, and that of the third node approached the second node from the opposite direction as $\alpha$ increased. The increase in $\alpha$ mitigated the imbalance and facilitated the equalization of usage $U$ among nodes. To summarize the results of (b) and (c), we can say the following: even though the secondary nodes were set to have non-uniform preferences, the network behaved uniformly under specific conditions. In this paper, we refer to this observation as the "equalization effect". In Section "Evaluation of the uniformity of agent distribution", we introduce a criterion to evaluate the uniformity of the agent distribution resulting from the equalization effect.
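The preliminary test for Scenario-1 can be sketched as a small agent-based simulation with simultaneous updates. This is an illustrative sketch under stated assumptions, not the code used in the study. For $\alpha = 0$ it assumes $r_i = A w_i / \sum_j w_j$ and a return probability $lA$ per step. For $\alpha \neq 0$ the linear penalty $\alpha N_i(t)/N_v$ used below is a purely hypothetical stand-in for the congestion term of Eq. (1), so the sketch is not expected to reproduce Fig. 2b or Fig. 2c quantitatively. The function name `simulate_scenario1`, the time averaging over the last steps, and the reading of usage as $N_i/N_v$ are also assumptions.

```python
import random


def simulate_scenario1(weights, n_agents=1000, A=0.1, l=2.0, alpha=0.0,
                       steps=1000, average_last=200, seed=0):
    """Return the time-averaged usage N_i / N_v of each secondary node."""
    rng = random.Random(seed)
    M = len(weights)
    total_w = float(sum(weights))
    position = [0] * n_agents        # 0 = primary node, i >= 1 = i-th secondary node
    counts = [n_agents] + [0] * M    # number of agents currently on each node
    usage_sum = [0.0] * M

    for step in range(steps):
        # Move probabilities are built from the counts at the current step,
        # so that all agents are updated simultaneously.
        # ASSUMPTION: alpha * N_i / N_v is a hypothetical congestion penalty,
        # not the paper's Eq. (1); it vanishes for alpha = 0.
        r = [A * max(0.0, w / total_w - alpha * counts[i + 1] / n_agents)
             for i, w in enumerate(weights)]
        new_position = list(position)
        for agent, node in enumerate(position):
            u = rng.random()
            if node == 0:
                cumulative = 0.0
                for i, r_i in enumerate(r):
                    cumulative += r_i
                    if u < cumulative:
                        new_position[agent] = i + 1
                        break
            elif u < l * A:          # return to the primary node with probability l * A
                new_position[agent] = 0
        position = new_position
        counts = [0] * (M + 1)
        for node in position:
            counts[node] += 1
        if step >= steps - average_last:
            for i in range(M):
                usage_sum[i] += counts[i + 1] / n_agents

    return [s / average_last for s in usage_sum]


if __name__ == "__main__":
    usage = simulate_scenario1([1, 2, 3], n_agents=1000, A=0.1, l=2.0)
    print([round(u, 3) for u in usage])
```

With $\alpha = 0$ the averaged usage can be compared against $w_i / \{(l+1) \sum_j w_j\}$, the stationary value that follows from the same assumptions.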
Evaluation of the uniformity of agent distribution. We define the imbalance of the system as the deviation of the state vector $S^{(n)}$ in a stationary state from an ideal state vector $S^{(n)}_h$, where each secondary node has the same number of agents. We directly obtain the state vector $S$ [...]

Travel efficiency on the network. As a preliminary test for Scenario-2, we investigated $T_s$, the number of time steps required for all agents to finish visiting all secondary nodes, for the system with $M = 3$, similarly to Scenario-1. We set the parameters $(N_v, M, A, l, \alpha)$ to (1,000, 3, 0.1, 2.0, 2.5). Figure 3A shows the change in the usage $U$ of each secondary node during $T_s$. The blue chain lines indicate the results in the case of $\alpha = 0$, where each agent stochastically chooses its destination by referring only to fixed preferences. By contrast, the red solid lines represent the results when $\alpha = 2.5$, where each agent stochastically chooses its destination by referring to the preferences affected by the congestion rate of each secondary node. The circular, square, and triangular symbols represent the first, second, and third nodes, respectively. Importantly, usage $U$ tends to change moderately [...] shows the dependence of $T_s$ on the parameter $\alpha$. We can confirm that $T_s$ drastically decreases soon after $\alpha$ becomes greater than zero, reaching a plateau for $\alpha$ greater than approximately 1.5.
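The call-loss mechanism of Scenario-2 can be illustrated for the case without congestion information ($\alpha = 0$). The sketch below is only one reading of the description in Section "Model" and is not the authors' code. It assumes that an agent on the primary node draws a target among all $M$ secondary nodes with probability $w_i / \sum_j w_j$, that the trial is discarded if the target has already been visited, that the return probability is $lA$ per step, and that a node counts as visited when the agent arrives. The function name `travel_time_scenario2` is hypothetical.

```python
import random


def travel_time_scenario2(weights, n_agents=1000, A=0.1, l=2.0, seed=0,
                          max_steps=1_000_000):
    """Return T_s, the step at which every agent has visited every secondary node."""
    rng = random.Random(seed)
    M = len(weights)
    nodes = list(range(1, M + 1))
    position = [0] * n_agents                    # 0 = primary node
    visited = [set() for _ in range(n_agents)]   # nodes already visited by each agent
    remaining = n_agents                         # agents that have not finished yet

    for step in range(1, max_steps + 1):
        # With alpha = 0 the agents do not interact, so updating them one by
        # one is equivalent to a simultaneous update.
        for agent in range(n_agents):
            if len(visited[agent]) == M:
                continue                         # this agent has finished
            if position[agent] == 0:
                target = rng.choices(nodes, weights=weights)[0]
                if target not in visited[agent]:
                    position[agent] = target
                    visited[agent].add(target)
                    if len(visited[agent]) == M:
                        remaining -= 1
                # otherwise the trial is discarded (call loss)
            elif rng.random() < l * A:
                position[agent] = 0              # return to the primary node
        if remaining == 0:
            return step
    raise RuntimeError("not all agents finished within max_steps")


if __name__ == "__main__":
    for M in (3, 6, 12):
        weights = list(range(1, M + 1))          # weighted preferences w_i = i
        print(M, travel_time_scenario2(weights, n_agents=200, seed=1))
```

Because discarded trials become more frequent as an agent's list of unvisited nodes shrinks, the last few visits dominate the travel time in this reading, which is the effect that congestion-aware selection is reported to mitigate.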
In summary, the two preliminary tests for Scenario-1 and Scenario-2 confirmed that avoiding congestion can promote the equalization effect in both steady-state agent distribution and travel efficiency. Accordingly, in the following sections, we examine the equalization effect for general cases through simulations over a wider range of the number of secondary nodes $M$, the number of agents $N_v$, and the parameter $\alpha$.

Results
Simulation results for Scenario-1. Figure 4 shows the dependence of the imbalance ratio $I_h$ on parameter $M$ for different values of $N_v$ and $\alpha$ in Scenario-1. The dependence of $I_h$ on $M$ can be divided into three cases. The black lines represent the case of uniform preferences with $\alpha = 0$, where we set weight $w_i$ to 1 for all $i$. We refer to the results of the black lines as (A). The blue line represents the case of weighted preferences with $\alpha = 0$, where we set weight $w_i$ to $i$ for the $i$th secondary node. We refer to the results of the blue line as (B). The red lines represent the cases of weighted preferences with $\alpha \neq 0$, where we set weight $w_i$ to $i$ for the $i$th secondary node, as in case (B). We refer to the results of the red lines as (C). In the legend, $(W_t, \alpha, N_v)$ represents the set of the preference type, the parameter $\alpha$, and the number of agents $N_v$, where W indicates weighted preferences and U represents uniform preferences. Specifically, in case (A), we measured the dependence of $I_h$ on parameter $M$ for $N_v$ = 1,000, 2,000, 5,000, 10,000, and 30,000 while keeping $\alpha$ constant at zero. It was observed that $I_h$ decreases as $N_v$ increases, and we obtain four different black curves. In case (B), we set the same parameters as in (A) for $N_v = 1,000$, except for the preferences mentioned above. In this case, $I_h$ was observed to be greater than in case (A) with $N_v = 1,000$ over the whole range of parameter $M$. In case (C), we set the same conditions as in case (B), except for $\alpha \neq 0$. Consequently, $I_h$ surges in a small range of $M$ and, after reaching its minimum value, increases as $M$ increases. Thereafter, $I_h$ converges to case (A) with $N_v = 1,000$, regardless of the degree of $\alpha$, and the strength of the surges in $I_h$ becomes larger as $\alpha$ decreases. The reasons for the peculiar dependencies shown in Fig. 4 are discussed from the viewpoint of multivariate statistics in Section "Discussion".
Simulation results for Scenario-2. We measured the dependence of travel time steps $T_s$ on parameters $\alpha$, $M$, and $N_v$ in Scenario-2. First, we set $N_v$ to 1,000 and then measured the dependence of $T_s$ on parameters $\alpha$ and $M$. Figure 5a shows the dependence of $T_s$ on parameter $\alpha$ for four different values of $M$ from 3 to 49 on a semilogarithmic scale in the case of weighted preferences, where we set the weight $w_i$ to $i$ for the $i$th secondary node. As observed in the preliminary tests with $M = 3$, $T_s$ decreases soon after $\alpha$ becomes greater than zero and reaches a plateau in all cases of $M$. It was also observed that the difference in $T_s$ between the cases of $\alpha = 0$ and $\alpha \neq 0$ increases as $M$ increases. Figure 5b shows the dependence of $T_s$ on the parameter $M$ for various values of $\alpha$ between 0 and 3.0 on a linear scale in the cases of both weighted and uniform preferences. Notably, $T_s$ was observed to increase exponentially when the preferences were weighted with $\alpha = 0$. By contrast, $T_s$ increased linearly to a similar degree when the preferences were either weighted with $\alpha \neq 0$ or uniform. An interesting point is that, when $\alpha \neq 0$, we obtain the same results as for uniform preferences almost irrespective of the degree of parameter $\alpha$. In addition, Fig. 5c shows the dependence of $T_s$ on the number of agents $N_v$ for $\alpha = 0$ and for $\alpha = 2.5$, which is a representative case of $\alpha \neq 0$. By fitting each case using a function $y = a \log(x) + b$, where $a$ and $b$ are constants, it was found that $T_s$ increased more moderately when $\alpha \neq 0$ than when $\alpha = 0$. We confirmed from Fig. 5c that the stability of the network in the $N_v$ direction also increased because of the equalization effect.

Discussion
[...] [51], and data analysis in experimental particle physics [52]. In this section, we describe the uncertainty of the imbalance ratio $I_h$ from the viewpoint of multivariate statistics. In the definition of the imbalance ratio $I_h$ in Eq. (7), $S^n_{h,i}$ is a deterministic value because $S^n_{h,i}$ holds only constant parameters: the preference $w_i$, the control parameter $\alpha$, and the number of secondary nodes $M$. By contrast, $S^n_i$ is a stochastic variable because it holds $N_i(t)$, as confirmed from Eq. (1) and Eq. (5); accordingly, $I_h$ can be represented as a function of the set of stochastic variables $S^n_i$ ($i = 1, 2, \cdots, M$) as follows:

$$I_h(S_1, S_2, \cdots, S_i, \cdots, S_M), \tag{8}$$

where $S_i$ is an abbreviation for $S^n_i$ and, from a mathematical perspective, $I_h$ is a mapping from the set $(S_1, S_2, \cdots, S_M)$. According to multivariate statistics, the propagation of the uncertainty $\sigma_{I_h}$ of $I_h$ is described by Eq. (9), where $\partial I_h / \partial S_i$ represents the partial derivative of $I_h$ with respect to variable $S_i$. The center bracket represents the variance-covariance matrix of the system (hereafter referred to as $E$). $\sigma_i^2$ represents the variance of $I_h$ with respect to variable $S_i$, and $\sigma_{ij}$ indicates the covariance of $I_h$ between variables $S_i$ and $S_j$. Equation (9) can be expressed in scalar form as Eq. (11). The first and second terms inside the square root of Eq. (11) are the contributions of the diagonal and non-diagonal components of the variance-covariance matrix $E$, respectively. In matrix $E$, the covariance $\sigma_{ij}$ becomes zero when the variables $N_i$ and $N_j$ are uncorrelated, and the second term inside the square root of Eq. (11) vanishes if all the variables are uncorrelated. According to the theory of errors, the measured value $I_h$ can be decomposed into the mean value $\langle I_h \rangle$ and the uncertainty $\sigma_{I_h}$ as $I_h = \langle I_h \rangle + \sigma_{I_h}$. When $\alpha = 0$, $S_i$ and $S^n_{h,i}$ become equal since the second terms of $r_i$ in Eq. (1) and of $r_i$ in Eq. (6) vanish; from the definition of $I_h$ in Eq. (7), the mean $\langle I_h \rangle$ can be estimated as zero. Meanwhile, when $\alpha \neq 0$, the number of agents in each secondary node is equalized after reaching a stationary state owing to the equalization effect.
Accordingly, we can assume that the difference between $S^n_i$ and $S^n_{h,i}$ becomes sufficiently small to be negligible, and the relationship $S^n_i \approx S^n_{h,i}$ holds for a sufficiently large $\alpha$. In other words, $\langle I_h \rangle$ can be approximately zero or a small number $\epsilon$. In summary, we obtain the relationship in Eq. (12). When $\alpha = 0$, each agent chooses its direction only by referencing fixed preferences. In this case, all resulting statistics $(N_1, N_2, \cdots, N_M)$ in the secondary nodes are uncorrelated. Because of the linear transformation relationship, all stochastic variables $(S_1, S_2, \cdots, S_M)$ are also uncorrelated. Accordingly, the covariances in the non-diagonal components of $E$ are zero and the second term in the square root of Eq. (11) vanishes; we obtain the value of $\sigma_{I_h}$ only from the first term in Eq. (11). Because we can calculate $\partial I_h / \partial S_i$ by differentiating Eq. (7) with respect to $S_i$, the remaining parameter that must be estimated is the deviation $\sigma_i$. By contrast, when $\alpha \neq 0$, we need to estimate $\sigma_{ij}$ in addition to $\sigma_i$ because of the emergence of correlations among secondary nodes. Here, each agent selects the $i$th node from among the $M$ secondary nodes with probability $r_i$ and avoids the $i$th node with probability $1 - r_i$ at each time step, as mentioned in Section "Model". Therefore, the system follows Bernoulli trials, where the deviation is represented by $\sigma = \sqrt{Np(1 - p)}$. Here, $N$ indicates the number of statistics and $p$ represents the probability. The details of the derivations of $\sigma_i$ and $\sigma_{ij}$ in each case are given below.
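The two ingredients used above, first-order propagation of uncertainty through a variance-covariance matrix and the binomial standard deviation, can be sketched numerically. The code below shows the standard textbook rule $\sigma_f^2 = \sum_i (\partial f/\partial S_i)^2 \sigma_i^2 + \sum_{i \neq j} (\partial f/\partial S_i)(\partial f/\partial S_j)\sigma_{ij}$, split into its diagonal and non-diagonal contributions; it is not a reproduction of Eq. (9) or Eq. (11), and the function names and the numerical inputs are hypothetical.

```python
import math


def binomial_std(n_trials, p):
    """Standard deviation sqrt(N p (1 - p)) of a binomial count."""
    return math.sqrt(n_trials * p * (1.0 - p))


def propagate_uncertainty(gradient, covariance):
    """First-order uncertainty of f(S_1, ..., S_M).

    gradient[i]      : partial derivative of f with respect to S_i
    covariance[i][j] : variance-covariance matrix E (sigma_i^2 on the diagonal)
    Returns (sigma_f, diagonal_term, off_diagonal_term).
    """
    m = len(gradient)
    diagonal = sum(gradient[i] ** 2 * covariance[i][i] for i in range(m))
    off_diagonal = sum(gradient[i] * gradient[j] * covariance[i][j]
                       for i in range(m) for j in range(m) if i != j)
    return math.sqrt(diagonal + off_diagonal), diagonal, off_diagonal


if __name__ == "__main__":
    grad = [1.0, 2.0]                        # hypothetical partial derivatives
    uncorrelated = [[4.0, 0.0], [0.0, 9.0]]  # only variances on the diagonal
    correlated = [[4.0, 1.0], [1.0, 9.0]]    # non-zero covariance
    print(propagate_uncertainty(grad, uncorrelated))
    print(propagate_uncertainty(grad, correlated))
    print(binomial_std(1000, 0.1))
```

When every covariance is zero, the non-diagonal term vanishes and only the variances contribute, which corresponds to the $\alpha = 0$ situation described in the text; a non-zero covariance adds the second term.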
In case (A), we can replace $p$ with $S^n_{h,i}$ because it represents the probability of finding an agent in the $i$th secondary node in a stationary state. In addition, the total number of statistics is proportional to $N_v M$, which is the number of secondary nodes multiplied by the total number of agents in the target system. We introduce a constant parameter $\sigma_0$. Consequently, an approximation for the uncertainty $\sigma_i$ is given by Eq. (13). In cases (B) and (C), it is necessary to modify Eq. (13) owing to the weighted preferences. Serial indices are assigned to the secondary nodes as linearly weighted preferences; the $i$th node has weight $i/W_s$, where $W_s$ is the sum of the numbers from 1 to $M$, which is $M(M + 1)/2$. When two variables $X$ and $Y$ have a linear relationship $Y = aX + b$, their variances $\sigma_Y^2$ and $\sigma_X^2$ satisfy $\sigma_Y^2 = a^2 \sigma_X^2$. Therefore, the variance of the $i$th secondary node for weighted preferences differs from that for uniform preferences. We estimate the variance in the $i$th secondary node as follows. We consider a linear transformation to $S^n_{h,i}$ from the state vector $\bar{S}^n_{h,i}$ with $a = (S^n_{h,i} - b)/\bar{S}^n_{h,i}$, where $\bar{S}^n_{h,i}$ is the state vector for uniform preferences with $\alpha = 0$, obtained by setting $w_i$ to 1 for all $i$, which is expressed as $\bar{S}^n_{h,i} = 1/\{(l + 1)M\}$. We modify Eq. (13) to be proportional to parameter $a$, which gives Eq. (14), where we introduce a constant factor $\mu$ in addition to the parameter $b$ for a simple expression.
In case (C), it is necessary to calculate the covariance between the secondary nodes. In this case, agents choose their destinations by referring to the congestion rates of all secondary nodes, which suggests that the statistics in a secondary node depend on the congestion status of the other secondary nodes. In other words, the statistics of the different secondary nodes correlate with each other. Accordingly, it is necessary to calculate both the diagonal and non-diagonal components of the variance-covariance matrix $E$. We estimate the values of the covariance components of $E$ as follows. First, we recall that there is a general relationship between two correlated stochastic variables $S_i$ and $S_j$, $\sigma_{ij} = \langle S_i S_j \rangle - \eta_i \eta_j$, where $\sigma_{ij}$ is the covariance between $S_i$ and $S_j$, $\eta_i$ and $\eta_j$ represent the means of $S_i$ and $S_j$, and $\langle S_i S_j \rangle$ is the mean of the product of $S_i$ and $S_j$. We evaluate $\eta_x$ ($x = i, j$) by its arithmetic mean, which can be obtained by summing $S^n_{h,i}$ for $i$ from 1 to $M$ and dividing the sum by $M$; the resulting $\eta_x$ is $1/\{(l + 1)M\}$, which yields the relationship $\sigma_{ij} = \langle S_i S_j \rangle - \{(l + 1)M\}^{-2}$. Subsequently, because $\langle S_i S_j \rangle$ is the expected value of $S_i S_j$, it is evaluated by multiplying the state $S_i S_j$ by the probability $S^n_{h,i} S^n_{h,j}$. Here, $S_x$ includes an indeterminate parameter $c$ that characterizes the system as a Diophantine problem, as shown in Eq. (5) in Section "Agent distribution". We mentioned that $c = l$ is required for normalization when we refer to $S_x$ as a probability; however, the parameter $c$ remains unspecified when referring to $S_x$ as a system state. It is necessary to explicitly represent the parameter $c$ when describing an arbitrary state $S_x$ because parameter $c$ may be a major contributor to the observed nonlinear phenomena. Because $S^n_{h,i}$ is already normalized, as in Eq. (6), we can express $S_i$ in terms of $S^n_{h,i}$ as $S_i \approx \tilde{c} S^n_{h,i}$, where $\tilde{c} = c/l$. The mean $\langle S_i S_j \rangle$ can then be expressed as $(\tilde{c} S^n_{h,i} S^n_{h,j})^2$.
At this stage, the covariance can be represented as $\sigma_{ij} = (\tilde{c} S^n_{h,i} S^n_{h,j})^2 - \{(l + 1)M\}^{-2}$. Recall that the dependence of $I_h$ on $M$ in case (C) was observed to approach that in case (A) as parameter $\alpha$ increased. Specifically, the surges of $I_h$ in a small range of $M$ were observed to decrease as $\alpha$ increased, as shown in Fig. 4; agents were equally distributed among secondary nodes as the parameter $\alpha$ increased. This corresponds to the fact that the agent distribution in a stationary state gets closer to the uniform distribution as $\alpha$ increases. We can say that the correlations among different secondary nodes become negligible when agents are kept equally distributed among secondary nodes, compared to the case of a biased agent distribution. This is further explained [...] In brief, the covariance decreases as $\alpha$ increases. To reflect this, we introduced a phenomenological scale parameter $C_d$ that controls the intensity of $\sigma_{ij}$. Consequently, we obtained the expression for the covariance $\sigma_{ij}$ given in Eq. (15). In summary, our theoretical model has the scale parameters $\sigma_0$ for case (A), $\mu$ for case (B), and $(\tilde{c}, C_d, \mu)$ for case (C). We determined these scale parameters by fitting the experimental data because the scale of the system state is indeterminate, as represented by parameter $c$ in Eq. (5). Specifically, in case (A), we first determined $\sigma_0$ by fitting the case of $N_v = 1,000$ using least squares and used the same value when plotting the model values of $I_h$ in the other cases of $N_v = 2,000$, $N_v = 5,000$, $N_v = 10,000$, and $N_v = 30,000$. In case (B), we determined $\mu$ in a manner similar to that in case (A). In case (C), we searched for the optimal condition of $(\tilde{c}, C_d, \mu)$ that reproduces the measurements shown in Fig. 4. In addition, we investigated the dependence of the phenomenological scale parameter $C_d$ on parameter $\alpha$. Figure 6a compares our theoretical models from Eq. (13) to Eq. (15) with the measurements shown in Fig. 4.
Each solid line corresponds to the result with the same color and symbol in Fig. 4. The theoretical models show good agreement with the measurements in all three cases from (A) to (C). The value of $\sigma_0$ obtained by fitting the measurement in case (A) with $N_v = 1,000$ was $1.316 \times 10^{2}$ in this test. The other plots for various $N_v$ in case (A) were obtained using the determined $\sigma_0$. As a result of the fittings, the optimal values of $\mu$ for case (B) and $(\tilde{c}, \mu)$ for case (C) were obtained as $3.072 \times 10^{-4}$ and $(1.0 \times 10^{3}, 2.258 \times 10^{-4})$, respectively. Note that the parameter $\epsilon$ in Eq. (12) was confirmed to be zero.
The model in case (B) calculates only the diagonal components of the variance-covariance matrix $E$. On the other hand, the model in case (C) calculates the non-diagonal components of $E$ as well as the diagonal components using the generic statistical relationship $\sigma_{ij} = \langle S_i S_j \rangle - \eta_i \eta_j$, since the statistics in different secondary nodes are expected to correlate with each other as a result of agents mutually referring to the congestion rate of the secondary nodes. In this respect, we can confirm from Fig. 6a that the contributions from the non-diagonal components of $E$ reproduce the characteristic surges of $I_h$ in small ranges of the parameter $M$ and the convergence to the case of uniform preferences. This suggests that mutually referring to the congestion information primarily causes the surges of $I_h$, that is, the deterioration of uniformity. In other words, referencing congestion information can make the uniformity worse rather than better if the degree of reference to the congestion information is insufficient in small systems. As mentioned in the Introduction, the network becomes endogenous when the system provides agents with congestion information to balance agents among nodes because the resulting agent distribution feeds back into the input congestion rates; hence, our findings can be helpful when controlling such endogenous traffic networks that provide agents with congestion information as traffic information in real-world cases. In addition, Fig. 6b shows a plot of the values of $C_d$ for different values of $\alpha$ in case (C), where $D = 1.3 \times 10^{-6}$. It was found that parameter $C_d$ is approximately proportional to the reciprocal of the square of parameter $\alpha$. We can confirm the following from Fig. 6b: when agents avoid congestion linearly with the congestion rate on the scale of $\alpha$, the non-diagonal components of the variance-covariance matrix vary approximately as the inverse square of the parameter $\alpha$.
This finding also serves as a guide for applying the results to crowd management at event venues. Consider a situation where visitors are dispersed across several local areas within an event venue. Event managers are typically expected to ensure that visitors are equally distributed across local areas to mitigate the risk of infection or accidents caused by dense crowds. However, determining how forcefully visitors need to be guided is difficult, since the extent to which visitors will follow the guidance is unknown. Our results indicate that if visitors respond linearly to congestion information in proportion to the parameter α, the uniformity of the visitor distribution improves as α increases. This effect can be attributed to the fact that the degree of mutual reference, i.e., the potential source of the surge in imbalance, decreases approximately in proportion to the inverse square of α. This knowledge can be used as an indicator for event managers to successfully distribute visitors by understanding how strongly visitors respond to the information provided to them.
Consequently, our theoretical model accurately describes the mechanism of the target system; our analysis corroborated that the balance between the equalization of network usage by avoiding congestion and the amplification of covariance caused by a mutual reference to congestion information determines the overall uniformity of a network with star topology.
Traveling time in Scenario-2. The most straightforward way to reproduce the travel time steps $T_s$ in Scenario-2 is to assume that the times for moving from the primary node to a secondary node and for returning from the secondary node are proportional to the reciprocals of the hopping and return probabilities, respectively. For example, for uniform preferences with α = 0, the hopping probability is given by 1/M for each secondary node, and the returning probability is given by lA, as explained in Section "Model". Hence, the travel time step required to visit a cell, $t_s$, can be described as $t_s = \eta M^{-1} + \zeta\, lA^{-1}$, where η and ζ are coefficients. By summing $t_s$ from 1 to M, we obtain the expression of $T_s$ in Eq. (16). In the case of weighted preferences, we can calculate the travel time steps $T_s$ in the same way as for uniform preferences after replacing 1/M with the inverse of the probability $S^{n}_{h,i}$, which gives Eq. (17). When α = 0, Eq. (17) can be further broken down into Eq. (18) by using the mathematical relationship $\sum_{i=1}^{k} 1/i \approx \ln k + \gamma + 1/(2k)$, where γ represents the Euler-Mascheroni constant, which nearly equals 0.577215664 [53].
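The accuracy of the approximation $\sum_{i=1}^{k} 1/i \approx \ln k + \gamma + 1/(2k)$ can be checked numerically. The short Python sketch below compares the exact partial sum with the approximation; the function names are illustrative, and the constant is truncated to ten decimal places.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant, truncated

def harmonic(k):
    """Exact partial sum 1 + 1/2 + ... + 1/k."""
    return sum(1.0 / i for i in range(1, k + 1))

def harmonic_approx(k):
    """Approximation ln k + gamma + 1/(2k) used in the text."""
    return math.log(k) + EULER_GAMMA + 1.0 / (2 * k)

for k in (5, 25, 100):
    exact = harmonic(k)
    approx = harmonic_approx(k)
    print(k, exact, approx, abs(exact - approx))
```

The absolute error stays below $1/(12k^2)$, so the approximation is already close for the small values of M where the surges discussed above occur.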
In Fig. 6c, the solid red line with the star symbol and the solid blue line with the circle symbol indicate the measurements for uniform preferences and for weighted preferences with α = 0, respectively. The dashed lines represent the fitting lines obtained using Eq. (17) and Eq. (18). We confirmed that our model describes the variation in the traveling time steps in Scenario-2. Figure 6d shows the dependence on the parameter α of the ratio of η or ζ to their sum in the case of weighted preferences, obtained by fitting the measurements with our model in Eq. (18) for all cases of α ≠ 0, where $C_d$ represents $\eta/(\eta + \zeta)$ or $\zeta/(\eta + \zeta)$. The relative ratio of ζ, which is the coefficient of the return time, increases, and the relative ratio of η, which is the coefficient of the hopping time from the primary to the secondary node, decreases; that is, the time required for the outward trip decreases as agents avoid congestion more strongly with increasing α.
In addition, we can show the exponentiality and linearity of the time steps $T_s$ in the respective cases of weighted preferences with α = 0 and α ≠ 0 from a different angle as follows. As mentioned in Section "Model", the total probability is normalized over all the secondary nodes, and agents lose the probabilities of choosing the already-visited secondary nodes; if an already-visited node is selected in a trial, the system discards the trial. Therefore, the probability of hopping to any of the unvisited secondary nodes, $P_u$, is given as the complement of the sum of the probabilities of hopping to one of the secondary nodes that have already been visited, $P_a$, that is, $P_u + P_a = 1$, where the subscripts u and a indicate the "unvisited" and "already-visited" nodes, respectively. When an agent avoids congestion owing to α ≠ 0, the probability of hopping to each secondary node becomes equalized and can be approximately expressed as 1/M; when k nodes have already been visited (k = 0, 1, ..., M − 1), $P_a$ is given as k/M, and $P_u$ is expressed as 1 − k/M. Here, we assume that the average time step required to hop from the primary node to one of the secondary nodes, $t^{k}_{s}$, is proportional to the inverse of the value of $P_u$; $t^{k}_{s}$ can be expressed as $\eta (1 - k/M)^{-1}$. Accordingly, the total time steps required for hopping from the primary node to the secondary nodes, $T^{h}_{s}$, can be obtained by summing $t^{k}_{s}$ over k:

$$T^{h}_{s} = \sum_{k=0}^{M-1} \eta \left(1 - \frac{k}{M}\right)^{-1}.$$

A simple calculation leads to

$$T^{h}_{s} = \eta M \sum_{k=1}^{M} \frac{1}{k}. \qquad (19)$$

Here, the summation in Eq. (19) can be approximated as $\sum_{k=1}^{M} 1/k \approx \ln M + \gamma + 1/(2M)$, similar to Eq. (18). We refer to the summation in Eq. (19) as $\Sigma_1$. As we can observe from Euler's approximation, $\Sigma_1$ shows a moderate dependence on the parameter M, as shown by the blue line in Fig. 7a. Consequently, the component ηM of Eq. (19) is confirmed to be dominant in the M-dependence of $T^{h}_{s}$, as indicated by the blue line with the circle symbol in Fig. 7b. Accordingly, the linearity of $T_s$ is confirmed for weighted preferences with α ≠ 0.
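The argument for the equalized case can be sketched numerically. The Python snippet below sums the hopping time $\eta / P_u$ with $P_u = 1 - k/M$ and compares it with the factorized form $\eta M \Sigma_1$. It assumes that k counts the already-visited nodes and runs from 0 to M − 1, so that $P_u$ never vanishes; this indexing, the function names, and the value η = 1 are choices made for illustration.

```python
def hopping_time_equalized(M, eta=1.0):
    """Sum eta / P_u over all visits, with P_u = 1 - k/M after k visited nodes."""
    return sum(eta / (1.0 - k / M) for k in range(M))

def sigma_1(M):
    """Harmonic sum 1 + 1/2 + ... + 1/M, the summation referred to in Eq. (19)."""
    return sum(1.0 / k for k in range(1, M + 1))

for M in (10, 25, 50, 100):
    direct = hopping_time_equalized(M)
    factorized = M * sigma_1(M)  # eta * M * Sigma_1 with eta = 1
    print(M, direct, factorized, sigma_1(M))
```

The two columns agree up to rounding, and the last column changes only slowly with M, which is why the factor ηM governs the M-dependence in this case.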
Meanwhile, when α = 0, agents move to one of the secondary nodes only according to the fixed preference $w_i$ for the ith secondary node. Because the secondary node with a larger weight is preferentially visited, $P_a$ when k nodes have already been visited (k = 0, 1, ..., M − 1) can be expressed as $\sum_{j=1}^{k} (M - j + 1)/W$ as a typical case, where W is the sum of the preferences $w_i = i$ from i = 1 to M, which is M(M + 1)/2. In this case, $P_u$ is expressed as $1 - \sum_{j=1}^{k} (M - j + 1)/W$. We assume that the average time step $t^{k}_{s}$ is proportional to the inverse of the value of $P_u$, similar to Eq. (19). Then, $T^{h}_{s}$ is calculated by summing $t^{k}_{s}$ over k; the resulting expression of $T^{h}_{s}$ is obtained as follows:

$$T^{h}_{s} = \sum_{k=0}^{M-1} \eta \left(1 - \sum_{j=1}^{k} \frac{M - j + 1}{W}\right)^{-1} = \eta M (M + 1) \sum_{k=1}^{M} \frac{1}{k(k + 1)}. \qquad (20)$$

Here, we refer to the summation of Eq. (20) as $\Sigma_2$:

$$\Sigma_2 = \sum_{k=1}^{M} \frac{1}{k(k + 1)}.$$

The dashed red line in Fig. 7a shows the dependence of $\Sigma_2$ on the parameter M; its growth slows at approximately M = 25, and it then converges to one. Consequently, the component ηM(M + 1) in Eq. (20) is confirmed to be dominant in the M-dependence of $T^{h}_{s}$, as indicated by the red line with the square symbol in Fig. 7b. Accordingly, the exponentiality of $T_s$ is confirmed for weighted preferences with α = 0. Notably, we describe all the cases observed in Fig. 5b using the theoretical models represented by Eqs. (16) to (20): uniform preferences with α = 0, weighted preferences with α = 0, and weighted preferences with α ≠ 0.
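The fixed-preference case can be checked in the same way. The Python sketch below evaluates $P_u$ from the expression $1 - \sum_{j=1}^{k} (M - j + 1)/W$ with W = M(M + 1)/2 and sums $\eta / P_u$. It assumes that k counts the already-visited nodes (k = 0, ..., M − 1) and that $\Sigma_2$ takes the form $\sum_{k=1}^{M} 1/(k(k+1))$, which follows from substituting $P_u = (M - k)(M - k + 1)/(M(M + 1))$; the function names and η = 1 are illustrative.

```python
def unvisited_probability(M, k):
    """P_u = 1 - sum_{j=1}^{k} (M - j + 1) / W, with W = M (M + 1) / 2."""
    W = M * (M + 1) / 2
    P_a = sum(M - j + 1 for j in range(1, k + 1)) / W
    return 1.0 - P_a

def hopping_time_fixed(M, eta=1.0):
    """Sum eta / P_u over k = 0, ..., M - 1 already-visited nodes."""
    return sum(eta / unvisited_probability(M, k) for k in range(M))

def sigma_2(M):
    """Sum of 1 / (k (k + 1)) for k = 1..M; telescopes to M / (M + 1)."""
    return sum(1.0 / (k * (k + 1)) for k in range(1, M + 1))

for M in (5, 25, 100):
    direct = hopping_time_fixed(M)
    factorized = M * (M + 1) * sigma_2(M)  # eta * M (M + 1) * Sigma_2, eta = 1
    print(M, direct, factorized, sigma_2(M), M / (M + 1))
```

Since $1/(k(k+1)) = 1/k - 1/(k+1)$, the sum telescopes to M/(M + 1), which is consistent with the statement that the summation levels off near M = 25 and converges to one.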
A star topology is often used as a conceptual model of decision-making by individuals with multiple choices. An example of the application of Scenario-2 is a health examination. In this case, the agents represent patients, and the M secondary nodes represent examination booths corresponding to M different examination items. Additionally, $w_i$ represents the priority of the ith examination item. When α = 0, the patient tries to visit the examination booths in a predetermined order, whereas when α ≠ 0, the patient preferentially selects and moves to the vacant booths. Our results indicate that by considering congestion information and allowing patients to be preferentially assigned to vacant examination sites, we can linearize the time it takes for everyone to complete their examination.
From another perspective, the relationship between $P_a$ and $P_u$ is similar to that between the fatigue from visiting congested nodes and the motivation to leave the primary node. $P_a$ increases by approximately 1/M every time an agent returns to the primary node when the preferences are weighted with α ≠ 0 or uniform. By contrast, $P_a$ is $\sum_{j=1}^{k} (M - j + 1)/W$ once k nodes have been visited for weighted preferences with α = 0, because every agent tends to visit the nodes with higher preferences; namely, the nodes with higher preferences become more congested. If we assume that the fatigue from visiting a node is proportional to the degree of congestion in the node, $P_a$ represents the accumulated fatigue due to visiting congested nodes. Our results suggest that fatigue increases moderately when α ≠ 0 because agents can avoid congestion; however, it builds up drastically when α = 0 because they face congestion. This discussion suggests that our results can be applied to problems in a variety of fields by interpreting the physical meaning of the parameters $P_u$ and $P_a$ from different angles.
Although this paper focuses on basic star topologies, our findings contribute to the understanding of complex networks. Generally, several complex networks can be decomposed into multiple superimposed or connected star topologies. Consider an undirected weighted random graph with M + 1 nodes. Connecting the ith node to another stochastically selected node from the remaining M nodes to generate an edge is equivalent to creating a combination of primary and secondary nodes in our system. Therefore, in the case where weights are assigned to nodes, in a manner similar to that of Scenario-1 of this study, the nonlinearity shown in Fig. 4 might be observed in the random graph at an equilibrium state. Previously, the relationship between the preferences of agents and the topology of networks has been studied in statistical physics involving complex networks 54 ; our study can contribute not only to a single-star topology but also to complex networks.

Conclusion
The importance of fundamental research on network topologies has been widely acknowledged in many scientific areas. This study examined the effect of congestion avoidance of agents given congestion information on optimizing traffic in a star network topology. We investigated the dynamics of a stochastic transportation network in which each agent at the primary node stochastically selects one of the secondary nodes by referring to the declining preferences based on the congestion rate of the secondary nodes. We examined the following two scenarios: each agent can repeatedly access the same secondary node, or each agent can access each secondary node only once. We refer to the former scenario as Scenario-1 and the latter as Scenario-2. We measured the uniformity of agent distribution in the stationary state in Scenario-1, and we measured the travel time for all agents visiting all the nodes in Scenario-2. The findings of this study are summarized as follows.
In Scenario-1, the uniformity of agent distribution was found to show three types of nonlinear dependence on the number of secondary nodes. We found that multivariate statistics describe these characteristic dependences well, revealing the existence of an optimal number of secondary nodes that makes the agent distribution most uniform. Our theoretical analysis corroborates that the balance between the equalization of network usage by avoiding congestion and the amplification of the covariance caused by mutual reference to congestion information determines the overall uniformity of the star network topology. This further indicates that referencing congestion information can make the uniformity of networks worse rather than better if the degree of reference to the congestion information is insufficient in small systems; this finding can be helpful when controlling endogenous traffic networks that provide agents with congestion information as traffic information in real-world cases. In addition, our analysis shows that if agents respond linearly to congestion information in proportion to a scale parameter, the uniformity of agent distribution improves as the parameter increases because the degree of mutual reference, i.e., the potential source of the surge in imbalance, decreases approximately in proportion to the inverse square of the parameter. This knowledge can be used as an indicator for event managers to successfully distribute visitors among local areas in an event venue by looking at how strongly visitors respond to the information provided.
In Scenario-2, we discovered that congestion avoidance linearizes the travel time irrespective of the degree of reference to the congestion information. In contrast, the travel time increases exponentially with the number of secondary nodes when agents do not refer to congestion information at all. Our theoretical models explain the linearity and exponentiality in their respective cases. Using a health examination as an example of Scenario-2, we demonstrated that allowing patients to be preferentially assigned to vacant examination sites can linearize the time it takes for everyone to complete their examination. In future work, the physical parameters will be interpreted from different angles in both scenarios to develop further applications. Consequently, we successfully described the optimization effect of congestion-avoiding behavior on the collective dynamics of agents in star topologies.